Abstract

We provide a reason for Bayesian updating, in the Bernoulli case, even when it is assumed that observations are independent and identically distributed with a fixed but unknown parameter $\theta_0$. The motivation relies on the use of loss functions and asymptotics. Such a justification is important due to the recent interest and focus on Bayesian consistency, which indeed assumes that the observations are independent and identically distributed rather than being conditionally independent with joint distribution depending on the choice of prior.

1. Introduction

The aim of this paper is to provide a straightforward and concise justification of the Bayesian approach to updating probability beliefs, in the case of a Bernoulli sequence of random variables. It is then seen that the details of the result can be applied to general parametric families. The key is the use of a loss function combined with the notion of asymptotics. That is, a loss function on the space of probability distributions on (0, 1) is employed which uses as information the prior knowledge and the observations. The general setting is made precise by appealing to obvious asymptotic requirements for the solution to the minimization of the loss function. Indeed, interest in Bayesian consistency has grown in recent years. See, for instance, Xing and Ranneby (2009) and the references cited there.

The use of loss functions is ubiquitous within the world of applied sciences, nowhere more so than within the decision sciences, which include statistics and particularly Bayesian statistics (Berger, 1993). To set the scene, if $\mathcal{A}$ is a set of actions, and the loss incurred is $L(a, X)$, where $X$ is an outcome/observation or piece of information and $a \in \mathcal{A}$, then the best choice is that $a$ which minimizes $L(a, X)$. On the other hand, if there are a number of pieces of information, say $(X_1, \ldots, X_n)$, each of which contributes an additive loss $L(a, X_i)$ under action $a$, then the best choice now minimizes the cumulative loss

$$ L(a, X_1, \ldots, X_n) = \sum_{i=1}^{n} L(a, X_i). $$

Such an additive style of cumulative loss would be appropriate when the $(X_i)$ are independent pieces of information.

We are interested in the case when $\mathcal{A}$ is the space of probability distributions on (0, 1). This occurs if our aim is to choose a probability distribution representing beliefs about a model parameter $\theta$ belonging to (0, 1). In this framework, we allow $\pi(\cdot)$ to be the proposed representation of beliefs about $\theta$ in the case of no observations. The distribution $\pi$ represents information, just as the Bernoulli observations $(X_1, \ldots, X_n)$ represent information. Hence, in the case $n = 0$, the loss function is $l_\pi(a, \pi)$. Maintaining the idea of cumulative loss, we now have

$$ L(a, X_1, \ldots, X_n, \pi) = \sum_{i=1}^{n} L(a, X_i) + l_\pi(a, \pi). \tag{1} $$

To this point there is little justification required; we are merely writing down a general loss function in order to determine a probability distribution on (0, 1), where the only assumption is that the losses are additive or cumulative. This seems relevant when the pieces of information are independent; that is, no one piece of information provides information about any of the others. To better indicate that $\mathcal{A}$ is a set of probability distributions, we will now replace $a$ with $\nu$.

The first and straightforward loss function to discuss is $l_\pi(\nu, \pi)$. So, $l_\pi(\nu, \pi)$ is the loss when $\nu$ is the probability measure correctly representing beliefs (and indeed with consistency it will end up providing correct beliefs) and $\pi$ is the proposed probability measure representing beliefs at the outset. Therefore, $l_\pi(\nu, \pi)$ can be interpreted as a loss in information. It is reasonable to require $\nu$ to be absolutely continuous with respect to $\pi$. Indeed, the updated probability should be zero on every event whose prior $\pi$ probability is zero. Therefore, $l_\pi(\nu, \pi)$ can be taken to be the $g$-divergence, introduced by Ali and Silvey (1966) and Csiszár (1967), i.e. $l_\pi(\nu, \pi) = \int g(d\nu/d\pi)\, d\nu$, where $g$ is a convex function such that $g(1) = 0$. Such a family of divergences is known to be a generalization of the Kullback-Leibler divergence, introduced by Kullback and Leibler (1951), which is obtained taking $g(x) = -\log(x)$. Bissiri and Walker (2009) establish that among the $g$-divergences, the Kullback-Leibler divergence is the only one which preserves a necessary coherence property whereby the solution at stage $n$ serves as the prior for subsequent observations. Hence, $l_\pi(\nu, \pi)$, the loss in information in using $\pi$ rather than $\nu$, is taken to be the Kullback-Leibler divergence.

Our aim now is to ascertain how $\pi$ changes to $\nu$, in the light of the information $(X_1, \ldots, X_n)$, for an apparently arbitrary loss function $L(\nu, X)$. Our form for this seems obvious in the sense that if we select $l(\theta, X)$ then we can merely take $L(\nu, X) = \int l(\theta, X)\, \nu(d\theta)$, since $\nu$ represents beliefs in $\theta$ and so $L(\nu, X)$ is understood as expected loss. Surprisingly now, an obvious asymptotic requirement will pin down $l(\theta, X)$ precisely. The following can also be seen as providing an explicit answer to a suggestion in Walker (2006, Section 6) about a possible justification of the Bayesian paradigm through the loss (1) and asymptotic requirements. So, while loss functions are typically regarded as a subjective choice, an objective choice based on asymptotic properties for $\nu$ provides justification for the Bayesian learning process. All the proofs are deferred to the Appendix.

2. Preliminaries

Denote by $(X_n)_{n \ge 1}$ the sequence of observations, which are i.i.d. Bernoulli with parameter $\theta_0$. Assume they are 0-1 random variables on a probability space $(\Omega, \mathcal{F}, P_\theta)$ and $P_\theta(X_n = 1) = \theta$ for each $n \ge 1$. Denote by $\pi$ the prior distribution for $\theta$. So, $\pi$ is a probability measure on the Borel subsets of (0, 1). In the rest of the paper, it will be assumed that $\pi$ is absolutely continuous with respect to the Lebesgue measure $\lambda$ and that there is a version of $d\pi/d\lambda$ that is a continuous function.

The observations $X_1, X_2, \ldots$ are usually considered conditionally independent and identically distributed given $\theta$. See, for example, Bernardo and Smith (1994). This makes it possible to update the prior $\pi$ by Bayes' theorem, obtaining the posterior distribution for $\theta$

$$ \pi^{(n)}(A) := \pi(A \mid X_1, \ldots, X_n) = \frac{\int_A \theta^{n\hat\theta_n}(1-\theta)^{n(1-\hat\theta_n)}\, \pi(d\theta)}{\int_{(0,1)} \theta^{n\hat\theta_n}(1-\theta)^{n(1-\hat\theta_n)}\, \pi(d\theta)}, $$

where $A$ is a Borel subset of (0, 1) and $\hat\theta_n := \frac{1}{n}\sum_{i=1}^{n} X_i$.

In applying Bayes' theorem to obtain the posterior distribution, the observations are not considered independent, but conditionally independent given $\theta$. Since we are assuming that the observations are independent, we are uncomfortable with the notion of a Bayesian model which artificially creates a dependence between the observations. However, as will soon be clear, the posterior distribution also arises as the solution of a minimization problem, which does not require such an assumption of dependence for the observations.

Following Section 1, we consider the loss (1) taking $l(\theta, X) = -\ln\big(P_\theta(X_1 = X) / P_{\theta_0}(X_1 = X)\big)$, the self-information loss function. So, the loss function (1) becomes:

$$ L(\nu) := L(\nu, x_1, \ldots, x_n, \pi) = \sum_{i=1}^{n} \int_{(0,1)} -\ln\big(P_\theta(X_i = x_i)/P_{\theta_0}(X_i = x_i)\big)\, \nu(d\theta) + D(\nu, \pi), \tag{2} $$

where $(x_1, \ldots, x_n)$ is a sample drawn from $(X_1, \ldots, X_n)$, $\nu$ is a probability measure on (0, 1) absolutely continuous with respect to $\pi$, and $D$ denotes the Kullback-Leibler divergence (relative entropy), i.e.

$$ D(Q_1, Q_2) = \int_S \ln\left(\frac{dQ_1}{dQ_2}\right) dQ_1, $$

where $S$ is the support of $Q_1$, for any couple $(Q_1, Q_2)$ of probability measures such that $Q_1 \ll Q_2$.

Notice that the first addendum in (2) depends on the sample and attains its minimum when $\nu = \delta_{\hat\theta_n}$, i.e. is degenerate at $\hat\theta_n$, while the second term takes into account only the prior belief about $\theta$ expressed by $\pi$. It is clear that the posterior $\pi^{(n)}$ minimizes the loss $L$, since

$$ L(\nu) = D(\nu, \pi^{(n)}) - \ln\left(\int_{(0,1)} \prod_{i=1}^{n} P_\theta(X_i = x_i)\, \pi(d\theta)\right) + \sum_{i=1}^{n} \ln\big(P_{\theta_0}(X_i = x_i)\big). $$

We could stop here since we have a justifiable loss function the solution of which is the Bayesian posterior distribution. However, we can work with a more general loss function and establish through asymptotic arguments that the $-\log$ loss is the best in some sense.
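The minimization property above can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the interval (0, 1) is replaced by a finite grid with a uniform discrete prior, and the names `kl`, `log_lik`, `self_information_loss` and `bayes_posterior`, the grid size, the sample and the value of `theta0` are all illustrative choices. It compares the loss of the Bayes posterior with two competitors and verifies that the loss minus the divergence from the posterior is the same constant for each of them.

```python
import math

def kl(q, p):
    """Discrete Kullback-Leibler divergence D(q, p) = sum_j q_j ln(q_j / p_j)."""
    return sum(qj * math.log(qj / pj) for qj, pj in zip(q, p) if qj > 0)

def log_lik(theta, xs):
    """Log-likelihood of a Bernoulli sample xs (entries 0 or 1) at theta."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in xs)

def self_information_loss(nu, prior, grid, xs, theta0):
    """Grid version of the loss: expected -ln likelihood ratio under nu, plus D(nu, prior)."""
    expected = sum(w * (log_lik(theta0, xs) - log_lik(t, xs)) for w, t in zip(nu, grid))
    return expected + kl(nu, prior)

def bayes_posterior(prior, grid, xs):
    """Bayes update of a discrete prior supported on the grid."""
    weights = [p * math.exp(log_lik(t, xs)) for p, t in zip(prior, grid)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical setup: uniform prior on 50 midpoints and a small 0-1 sample.
grid = [(j + 0.5) / 50 for j in range(50)]
prior = [1.0 / 50] * 50
xs = [1, 0, 1, 1, 0, 1, 1, 1]
theta0 = 0.7  # enters the loss only through an additive constant

post = bayes_posterior(prior, grid, xs)
mix = [0.5 * a + 0.5 * b for a, b in zip(post, prior)]  # an arbitrary competitor

loss_post = self_information_loss(post, prior, grid, xs, theta0)
loss_prior = self_information_loss(prior, prior, grid, xs, theta0)
loss_mix = self_information_loss(mix, prior, grid, xs, theta0)

# L(nu) - D(nu, posterior) should not depend on nu.
const_prior = loss_prior - kl(prior, post)
const_mix = loss_mix - kl(mix, post)
print(loss_post, loss_mix, loss_prior)
```

On this grid the posterior attains the smallest of the three losses, and the two constants agree up to rounding, which is the discrete counterpart of the identity displayed above.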

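So, an obvious and general alternative for the loss function in (2) is

$$ L_f(\nu) := \sum_{i=1}^{n} \int_{(0,1)} f\big(P_\theta(X_i = x_i)\big)\, \nu(d\theta) + D(\nu, \pi), $$

where the function $-\ln(\cdot)$ has been replaced by a function $f$ from (0, 1) into the non-negative real line including $+\infty$. Clearly this is appropriate as $f(P_\theta(X_i = x))$ is either $f(\theta)$ or $f(1 - \theta)$, depending on whether $x$ is 1 or 0. Denote by $\pi_f^{(n)}$ the probability measure that is absolutely continuous with respect to $\pi$ with density

$$ \frac{d\pi_f^{(n)}}{d\pi}(\theta) = \frac{e^{-n\{\hat\theta_n f(\theta) + (1-\hat\theta_n) f(1-\theta)\}}}{\int_{(0,1)} e^{-n\{\hat\theta_n f(t) + (1-\hat\theta_n) f(1-t)\}}\, d\pi(t)}. $$

The probability measure $\pi_f^{(n)}$ minimizes $L_f$ since

$$ L_f(\nu) = D(\nu, \pi_f^{(n)}) - \ln\left(\int_{(0,1)} e^{-n\{\hat\theta_n f(t) + (1-\hat\theta_n) f(1-t)\}}\, d\pi(t)\right). \tag{3} $$

A referee has pointed out connections with Lagrange functions, $\pi_f^{(n)}$ and the unique minimization of $L_f$.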
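To make the generalized update concrete, here is a small sketch under stated assumptions: the prior is discretized on a grid, and `general_update`, `neg_log`, the grid size and the sample counts are hypothetical names and values introduced only for illustration. It computes the weights of the update for a given $f$, checks that the choice $f = -\ln$ reproduces the ordinary Bayes update, and checks that adding a constant to $f$ changes nothing.

```python
import math

def general_update(prior, grid, theta_hat, n, f):
    """Discrete update: weights prop. to prior * exp(-n{theta_hat f(t) + (1-theta_hat) f(1-t)})."""
    expo = [-n * (theta_hat * f(t) + (1.0 - theta_hat) * f(1.0 - t)) for t in grid]
    shift = max(expo)  # subtract the maximum before exponentiating, for stability
    weights = [p * math.exp(e - shift) for p, e in zip(prior, expo)]
    total = sum(weights)
    return [w / total for w in weights]

def neg_log(x):
    return -math.log(x)

# Hypothetical setup: uniform prior on 100 midpoints, 14 successes in 20 trials.
grid = [(j + 0.5) / 100 for j in range(100)]
prior = [1.0 / 100] * 100
n, successes = 20, 14
theta_hat = successes / n

post_f = general_update(prior, grid, theta_hat, n, neg_log)

# Direct Bayes update for comparison: weights prop. to t^s (1 - t)^(n - s).
bayes = [p * t**successes * (1.0 - t)**(n - successes) for p, t in zip(prior, grid)]
total = sum(bayes)
bayes = [w / total for w in bayes]

# Adding a constant to f leaves the update unchanged.
post_shifted = general_update(prior, grid, theta_hat, n, lambda x: neg_log(x) + 3.0)

max_gap_bayes = max(abs(a - b) for a, b in zip(post_f, bayes))
max_gap_shift = max(abs(a - b) for a, b in zip(post_f, post_shifted))
print(max_gap_bayes, max_gap_shift)
```

Both gaps are at the level of rounding error, so on the grid the update with $f = -\ln$ coincides with Bayes' theorem and is invariant to additive constants in $f$.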



3. Theory

Our aim is to properly choose the function $f$ within the class $C^1(0, 1)$, apart from an additive constant. In fact, for any real constant $c$, $\pi_f^{(n)}(\cdot) = \pi_{f+c}^{(n)}(\cdot)$. It will be shown which conditions on $f$ are necessary and sufficient for the (strong) consistency of $\pi_f^{(n)}$. Next, a criterion will be defined that makes $f(\cdot) = -\ln(\cdot)$ the best choice and therefore the Bayesian posterior $\pi^{(n)}$ the best one.

3.1. Consistency

Assume that

$$ \pi(\theta_0 - \epsilon, \theta_0 + \epsilon) > 0, \tag{4} $$

for every $\epsilon > 0$. Since $\theta_0$ is unknown, this means that only priors whose support is the unit interval will be considered. It will be convenient to express the posterior in the following form:

$$ \pi_f^{(n)}(A) = \frac{\int_A e^{-n d(t, \hat\theta_n)}\, d\pi(t)}{\int_{(0,1)} e^{-n d(t, \hat\theta_n)}\, d\pi(t)}, \tag{5} $$

where

$$ d(x, y) := y\,\big(f(x) - f(y)\big) + (1 - y)\,\big(f(1 - x) - f(1 - y)\big) \tag{6} $$

for every $(x, y)$ in $(0, 1)^2$.

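Proposition 1. Let $f$ be a function of class $C^1(0, 1)$ and let $d$ be the function defined on $(0, 1)^2$ by (6). Then the following facts are equivalent:

(i) For every $\theta_0$, every $\epsilon > 0$, and every prior $\pi$ satisfying (4),

$$ \pi_f^{(n)}(\theta_0 - \epsilon, \theta_0 + \epsilon) \to 1, \quad n \to \infty, \quad P_{\theta_0}\text{-a.s.} \tag{7} $$

(ii) For every $(\theta_1, \theta_2)$ in $(0, 1)^2$,

a) $d(\theta_1, \theta_2) \ge 0$,

b) if $\theta_1 \ne \theta_2$ then $d(\theta_1, \theta_2) > 0$.

(iii) For every $0 < x < 1$,

a) $x f'(x) = (1 - x) f'(1 - x)$,
b) $f'(x) < 0$.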
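Conditions (ii) and (iii) are easy to probe numerically. The sketch below is only an illustration under assumptions of our own: the helper names `d`, `min_d_on_grid` and `symmetry_gap`, the grid, and the counterexample $f(x) = -x$ are not taken from the paper. It evaluates $d$ on a grid for $f = -\ln$, which satisfies (iii), and for $f(x) = -x$, which is decreasing but violates (iii.a).

```python
import math

def d(x, y, f):
    """d(x, y) = y (f(x) - f(y)) + (1 - y) (f(1 - x) - f(1 - y))."""
    return y * (f(x) - f(y)) + (1.0 - y) * (f(1.0 - x) - f(1.0 - y))

def min_d_on_grid(f, m=19):
    """Smallest value of d over an m-by-m grid of interior points."""
    pts = [(j + 1) / (m + 1) for j in range(m)]
    return min(d(x, y, f) for x in pts for y in pts)

def symmetry_gap(f_prime, m=19):
    """Largest violation of x f'(x) = (1 - x) f'(1 - x) on the grid."""
    pts = [(j + 1) / (m + 1) for j in range(m)]
    return max(abs(x * f_prime(x) - (1.0 - x) * f_prime(1.0 - x)) for x in pts)

# f(x) = -ln x satisfies (iii); f(x) = -x has f' < 0 but fails (iii.a).
min_log = min_d_on_grid(lambda x: -math.log(x))
min_lin = min_d_on_grid(lambda x: -x)
gap_log = symmetry_gap(lambda x: -1.0 / x)
gap_lin = symmetry_gap(lambda x: -1.0)
print(min_log, min_lin, gap_log, gap_lin)
```

For $f = -\ln$ the function $d(x, y)$ is the Kullback-Leibler divergence between Bernoulli laws with parameters $y$ and $x$, so it stays non-negative; for $f(x) = -x$ one finds $d(x, y) = (y - x)(2y - 1)$, which takes negative values, in line with the equivalence of (ii) and (iii).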

In the rest of the paper, it will be assumed that $\pi$ is absolutely continuous with respect to the Lebesgue measure and that its density is continuous on (0, 1). The following proposition determines the rate of convergence of $\pi_f^{(n)}$. In its statement and in the rest of the paper, the minimum between two real numbers $x$ and $y$ will be denoted by $x \wedge y$.

Proposition 2. If the hypotheses of Proposition 1 are satisfied, then, as $n \to \infty$,

$$ \pi_f^{(n)}\big((\theta_0 - \epsilon, \theta_0 + \epsilon)^c\big) \asymp \frac{1}{\sqrt{n}}\, e^{-n\,\delta(\epsilon)} \quad P_{\theta_0}\text{-a.s.}, \tag{8} $$

where $\delta(\epsilon) := d(\theta_0 - \epsilon, \theta_0) \wedge d(\theta_0 + \epsilon, \theta_0)$.

3.2. The choice of the loss function.

Here we study the large sample property of the posterior, and this can be done by considering the posterior variance given $\hat\theta_n$.

Proposition 3. If $V_f(\hat\theta_n)$ denotes the variance with respect to the distribution $\pi_f^{(n)}$ and $f$ satisfies the conditions of Proposition 1, then

$$ \lim_{n \to \infty} n\, V_f(\hat\theta_n) = \frac{\theta_0 (1 - \theta_0)}{-\theta_0 f'(\theta_0)} \quad P_{\theta_0}\text{-a.s.} \tag{9} $$

So, the limit depends obviously on $\theta_0$ and it is clear how: the Fisher information satisfies $I(\theta_0)^{-1} \propto \theta_0(1 - \theta_0)$, and so for larger values of $I(\theta_0)$ we will have a faster rate of convergence, since there is more information in the data for such $\theta_0$. Moreover, the information tells us how the convergence depends on $\theta_0$. This is amplified by the speed at which $\hat\theta_n$ converges to $\theta_0$, which is proportional to $\theta_0(1 - \theta_0)$. So, the $\theta_0(1 - \theta_0)$ term in the limit of (9) is taking account of the value of $\theta_0$. The other term should therefore not depend on $\theta_0$.

The reason for this is quite simple: if $\theta_0 f'(\theta_0)$ does depend on $\theta_0$ then we should be able to modify $f$ so that all $\theta_0$ obtain the largest value of $|\theta f'(\theta)|$. Hence, the optimal $f$ must indeed make this a constant and so we must take $-x f'(x) = M$ for some constant $M > 0$. Integrating, we have $f(x) = -M \ln x$, up to an additive constant.
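The role of the constant $M$ can be seen in a case with a closed form. The sketch below assumes a uniform prior on (0, 1), which is our choice for illustration and not a requirement of the paper; under that assumption the update for $f(x) = -M \ln x$ has density proportional to $t^{M n \hat\theta_n}(1 - t)^{M n (1 - \hat\theta_n)}$, i.e. it is a Beta distribution, so its variance is available exactly. The function name `n_times_variance` and the sample sizes are hypothetical.

```python
def n_times_variance(n, successes, M):
    """n * Var under Beta(M*s + 1, M*(n - s) + 1): the update for f = -M ln x, uniform prior."""
    a = M * successes + 1.0
    b = M * (n - successes) + 1.0
    variance = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return n * variance

# Hypothetical sample: 300000 successes in 1000000 trials.
n, successes = 1_000_000, 300_000
theta_hat = successes / n
for M in (0.5, 1.0, 2.0):
    # Compare the scaled variance with theta_hat (1 - theta_hat) / M.
    print(M, n_times_variance(n, successes, M), theta_hat * (1.0 - theta_hat) / M)
```

For large $n$ the scaled variance is close to $\hat\theta_n(1 - \hat\theta_n)/M$, so $M$ simply rescales the spread of the update; $M = 1$ gives the usual Bayesian posterior variance.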

Hence, we now just need to ascertain the reason why we should make $M = 1$, since we have established that we must have $f(x) = -M \ln x$ and the Bayesian learning rule is obtained precisely with $M = 1$. Suppose the choice $\theta = \theta_1$ is chosen stubbornly so that $\pi(\theta) = \delta_{\theta_1}(\theta)$. Hence, since $\delta_{\theta_1}$ will always represent beliefs, according to definitions in Section 2,



and so our expected loss for $n$ observations is

$$ L(\delta_{\theta_1}) = n M D(P_{\theta_0}, P_{\theta_1}). $$

We can understand our loss up to a sample of size $n$ when fixing $\theta_1$: it is predicting with the wrong measure, i.e. $P_{\theta_1}$ instead of $P_{\theta_0}$, on $n$ occasions. So our loss is $n D(P_{\theta_0}, P_{\theta_1})$, being consistent with using $D(\nu, \pi)$ in Section 1, and hence we must fix $M = 1$.



4. Discussion

We have constructed a loss function for selecting an updated belief probability measure on (0, 1) in the light of i.i.d. Bernoulli random variables. Having started out with a general form, the precise function can be pinned down by appealing to some necessary asymptotic properties. The consequence is that the Bayesian learning machine, in the Bernoulli case at least, can be understood via notions of loss functions and asymptotics while retaining the correct notion of an i.i.d. sample.

We believe that the ideas in this paper can be extended to the more general case; in the first instance for parametric models $f(x; \theta)$ and subsequently for nonparametric models $f(x)$, $f \in H$, where now the decision space consists of probability measures on $H$.

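Appendix

In order to prove Proposition 1 the following lemma will be useful.

Lemma 4. Let $f$ be a function of class $C^1(0, 1)$ such that

a) $x f'(x) = (1 - x) f'(1 - x)$,

b) $f'(x) < 0$,

for every $0 < x < 1$. Moreover, fix $0 < \theta < 1$ and define

$$ \varphi(x) = \theta f(x) + (1 - \theta) f(1 - x), \tag{10} $$

for every $0 < x < 1$.

Hence, $\varphi$ is a function of class $C^1(0, 1)$ such that

$$ \varphi'(x) = \frac{\theta - x}{1 - x}\, f'(x). \tag{11} $$

Moreover, $\varphi$ has the second derivative at $\theta$, which is equal to

$$ \varphi''(\theta) = -\frac{f'(\theta)}{1 - \theta}. \tag{12} $$

Proof. By (10),

$$ \varphi'(x) = \theta f'(x) - (1 - \theta) f'(1 - x). \tag{13} $$

A combination of (13) with (a) yields (11). Since $f'$ is a continuous function, (11) entails

$$ \lim_{x \to \theta} \frac{\varphi'(x) - \varphi'(\theta)}{x - \theta} = -\frac{f'(\theta)}{1 - \theta}, $$

and (12) is proved. □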
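The two derivative formulas of the lemma can be sanity-checked by finite differences. The sketch below is an illustration only: the loss $f(x) = -\ln x - x + x^2/2$ is a hypothetical example chosen by us because it satisfies (a) and (b) without being a multiple of $-\ln x$, and the step sizes and evaluation points are arbitrary.

```python
import math

def f(x):
    """Hypothetical loss satisfying (a) and (b): f(x) = -ln x - x + x^2 / 2."""
    return -math.log(x) - x + 0.5 * x * x

def f_prime(x):
    """Derivative of f: x f'(x) = -1 - x + x^2 is symmetric about 1/2, and f' < 0 on (0, 1)."""
    return -1.0 / x - 1.0 + x

def phi(x, theta):
    """phi(x) = theta f(x) + (1 - theta) f(1 - x)."""
    return theta * f(x) + (1.0 - theta) * f(1.0 - x)

def phi_prime_formula(x, theta):
    """Closed form (theta - x) / (1 - x) * f'(x) for the first derivative."""
    return (theta - x) / (1.0 - x) * f_prime(x)

def central_diff(g, x, h=1e-5):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

theta = 0.3
first_gap = max(
    abs(central_diff(lambda t: phi(t, theta), x) - phi_prime_formula(x, theta))
    for x in (0.1, 0.3, 0.5, 0.8)
)

h = 1e-4
second_fd = (phi(theta + h, theta) - 2.0 * phi(theta, theta) + phi(theta - h, theta)) / (h * h)
second_formula = -f_prime(theta) / (1.0 - theta)
print(first_gap, second_fd, second_formula)
```

The finite-difference values agree with the closed forms to within discretization error, and the second derivative at $\theta$ is positive, consistent with $\varphi$ having a minimum there.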

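Proof of Proposition 1. Let $A^c := (0, 1) \setminus A$ denote the complement of a subset $A$ of (0, 1).

To begin with, notice that by (5)

$$ \pi_f^{(n)}(\theta_0 - \epsilon, \theta_0 + \epsilon) = \left\{ 1 + \frac{\int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)^c} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)}{\int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)} \right\}^{-1} $$

and therefore (7) is tantamount to

$$ \lim_{n \to \infty} \frac{\int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)^c} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)}{\int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)} = 0 \quad P_{\theta_0}\text{-a.s.} \tag{14} $$

Let us prove that (ii) is necessary for (i). To this aim, assume that (i) is true and fix $0 < \theta_0 < 1$ and a probability measure $\pi$ satisfying (4). Hence, by virtue of (i), $\pi$ must satisfy (7) as well.

Since $f$ is continuous, the function $d(\cdot, \theta_1)$ is continuous as well for every $\theta_1$ in (0, 1). In particular, for every $\theta_1$, it is continuous at $\theta_1$ where its value is zero; i.e. for every $k > 0$ there exists some $\epsilon > 0$ such that $\inf_{\theta : |\theta - \theta_1| < 2\epsilon} d(\theta, \theta_1) > -2k$. Moreover, by the strong law of large numbers, $|\hat\theta_n - \theta_0| < \epsilon$ for sufficiently large $n$, $P_{\theta_0}$-a.s., and therefore

$$ \inf_{\theta : |\theta - \theta_0| < \epsilon} d(\theta, \hat\theta_n) \ge \inf_{\theta : |\theta - \hat\theta_n| < 2\epsilon} d(\theta, \hat\theta_n) > -2k, $$

for sufficiently large $n$, $P_{\theta_0}$-a.s., for every $k > 0$ and some $\epsilon > 0$. Hence, by the dominated convergence theorem,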



n— )-oo 



hm e"2«/c I e""'*(^'«"V(d0) 

(Ba-efia+e) ^^^^ 



(do-efia+e) 



for every $k > 0$ and for some $\epsilon > 0$.

Now assume that there is some $M$ such that $d(\theta, \theta_0) < M$ for every $\theta$ belonging to some set $C$ such that $\pi(C \setminus \{\theta_0\}) > 0$. Take $\epsilon > 0$ small enough so that $\pi((\theta_0 - \epsilon, \theta_0 + \epsilon)^c \cap C) > 0$. Notice that, by the strong law of large numbers and by continuity of $d(\theta, \cdot)$, for every $\theta \in (0, 1)$, $d(\theta, \hat\theta_n) - d(\theta, \theta_0) < M$ for sufficiently large $n$, $P_{\theta_0}$-a.s. Hence, for every $\theta \in C$, $d(\theta, \hat\theta_n) < 2M$ for sufficiently large $n$, $P_{\theta_0}$-a.s. and, by Fatou's lemma,



liminfe^"*^ / e-^''^''''"\{d9) 



> f liminfe"(2Af-d(e.e„))^(^^)^^^ 



So, combining (15) and (16), one notices that



(16) 



holds for sufficiently large $n$, $P_{\theta_0}$-a.s., for every $k > 0$ and some $\epsilon > 0$. Since (14) holds true for every $\epsilon > 0$, then $M > -k$ for any real positive number $k$, i.e. $M \ge 0$. So, $M \ge 0$ whenever $d(\theta, \theta_0) < M$ with positive $\pi$-probability. Therefore, $d(\theta, \theta_0) \ge 0$, $\pi$-a.s. Hence,



lim / e-"''('''^"V(d6i) 

(eo-e,«o+e) 



hm e-"''(^'''"V(d6') 

'{ee(0o-£:0o+e):d(e,eo)>o} 

/■ - s ^^^^ 

^TT{ee{eo-e,eo + e) -. d{0,eo)^o} 

< 7r(6'o - £, 6*0 + e)} - a.s., 

holds true $P_{\theta_0}$-a.s., by the dominated convergence theorem. Assume there is some $M$ such that $d(\theta, \theta_0) < M$, for every $\theta$ belonging to some set $D$ with positive $\pi$-probability and such that $\theta_0 \notin D$.
probability and such that Oq ^ D. Hence, combining ([T5)) and ([T51) . one notices 
that 

holds true for sufficiently large $n$, $P_{\theta_0}$-a.s., for sufficiently small $\epsilon > 0$. Since (14) holds true for every $\epsilon > 0$ together with (4), then $M$ must be positive. So, $M > 0$ whenever $d(\theta, \theta_0) < M$ for every $\theta \ne \theta_0$ with positive $\pi$-probability. Therefore, $d(\theta, \theta_0) > 0$, for every $\theta \ne \theta_0$, $\pi$-a.s. Since this is true for every $\theta_0$ and every $\pi$ whose support is the unit interval, (ii.b) must hold. Notice that (ii.b) trivially entails (ii.a) since $d(\theta, \theta) = 0$ for every $\theta$ by definition of $d$.

At this point, it will be proved that (ii) implies (iii). To this aim, define $\varphi(x) := d(x, \theta)$ for a fixed $0 < \theta < 1$. Since $d(\theta, \theta) = 0$, condition (ii) is tantamount to saying that the function $\varphi$ has an absolute minimum at $x = \theta$ for any $\theta$. Therefore, if (ii) is in force, $\varphi'(\theta) = 0$ must be true for every $\theta$ in the unit interval and condition (iii.a) follows.

By Lemma 4, (iii.a) entails

$$ \varphi'(x) = \frac{\theta - x}{1 - x}\, f'(x), \tag{20} $$

where $\varphi'$ is a continuous function since $f'$ is so. Since $\theta$ is an absolute minimum point for $\varphi$, there is $\delta > 0$ such that $\varphi'(x) > 0$ if $\theta < x < \theta + \delta$ and $\varphi'(x) < 0$ if $\theta - \delta < x < \theta$. This is tantamount to saying that $f'(x) < 0$ for every $x$ in $(\theta - \delta, \theta) \cup (\theta, \theta + \delta)$ for some $\delta$. Since this must hold for every $\theta$, condition (iii.b) follows.

Finally, it will be shown that (iii) is sufficient for (i). To this aim, notice that if (iii) holds then (20) is also in force by Lemma 4 and therefore $d(\cdot, \theta)$ is (strictly) decreasing on $(0, \theta)$ and (strictly) increasing on $(\theta, 1)$, for every $\theta$. Therefore, for every $\epsilon > 0$ and every $0 < \theta_1 < 1$, $d(\theta, \theta_1) > \delta(\epsilon, \theta_1)/2$ if $|\theta - \theta_1| > \epsilon$, where $\delta(\epsilon, \theta_1)$ denotes $d(\theta_1 - \epsilon, \theta_1) \wedge d(\theta_1 + \epsilon, \theta_1)$. Applying the dominated convergence theorem, this entails that

$$ \lim_{n \to \infty} e^{n \delta(\epsilon, \theta_0)/2} \int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)^c} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta) = 0. \tag{21} $$



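By continuity of $d(\cdot, \hat\theta_n)$ at $\hat\theta_n$, for every $\eta > 0$ there exists $\gamma$ such that $d(\theta, \hat\theta_n) < \eta$ if $|\theta - \hat\theta_n| < 2\gamma$, and by the strong law of large numbers $|\hat\theta_n - \theta_0| < \gamma$ for sufficiently large $n$, $P_{\theta_0}$-a.s. Therefore $d(\theta, \hat\theta_n) < \eta$ if $|\theta - \theta_0| < \gamma$ for sufficiently large $n$, $P_{\theta_0}$-a.s., and by Fatou's lemma,

$$ \liminf_{n \to \infty} e^{n\eta} \int_{(\theta_0 - \gamma, \theta_0 + \gamma)} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta) \ge \int_{(\theta_0 - \gamma, \theta_0 + \gamma)} \liminf_{n \to \infty} e^{n(\eta - d(\theta, \hat\theta_n))}\, \pi(d\theta) = \infty, \tag{22} $$

for every $\eta > 0$ and some $\gamma > 0$, $P_{\theta_0}$-a.s.

Combining (21) and (22), one obtains that

$$ \lim_{n \to \infty} e^{n(\delta(\epsilon, \theta_0)/2 - \eta)}\, \frac{\int_{(\theta_0 - \epsilon, \theta_0 + \epsilon)^c} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)}{\int_{(\theta_0 - \gamma, \theta_0 + \gamma)} e^{-n d(\theta, \hat\theta_n)}\, \pi(d\theta)} = 0 $$

for every $\eta, \epsilon > 0$. Taking $\eta < \delta(\epsilon, \theta_0)/2$, (i) follows. □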



By the strong law of large numbers, there exists a Borel subset $B$ of $\{0, 1\}^\infty$ with $P_{\theta_0}$-probability one such that $\hat\theta_n(x_1, \ldots, x_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$ converges to $\theta_0$ for all sequences $(x_n)_{n \ge 1}$ belonging to $B$. In the rest of this appendix, $\hat\theta_n$ will stand for $\hat\theta_n(x_1, \ldots, x_n)$ and we shall always assume that $(x_n)_{n \ge 1}$ belongs to $B$.

In order to prove Proposition 2 and Proposition 3 the following lemmas are useful.

Lemma 5. If $(g_n)_{n \ge 0}$ is a sequence of non-negative functions on (0, 1) dominated by an integrable function, then for every $\delta > 0$ there are $\eta_1, c_0 > 0$ such that

$$ \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)^c} e^{-n d(t, \hat\theta_n)}\, g_n(t)\, dt \le c_0\, e^{-n \eta_1} \tag{23} $$

for sufficiently large $n$.

Moreover, if $(c_n)_{n \ge 1}$ is a sequence converging to a positive number, $(g_n^*)_{n \ge 1}$ is a sequence of integrable functions on $\mathbb{R}$, and

$$ \int_{(-\infty, \infty)} e^{-c_n (t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt \le c, \tag{24} $$

for some real constant $c$ and for sufficiently large $n$, then

$$ \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)^c} e^{-n c_n (t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt \le k\, e^{-n \eta_2}, \tag{25} $$

for some $k, \eta_2 > 0$ and for sufficiently large $n$.

In the rest of the paper, the maximum between two real numbers $x$ and $y$ will be denoted by $x \vee y$.

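Proof. Let $\chi_n$ be a nonnegative and differentiable function on $(a, b)$ ($-\infty \le a, b \le \infty$) with a unique absolute minimum at $\hat\theta_n$ and such that

$$ \chi_n'(t) < 0 \ \text{ if } a < t < \hat\theta_n, \qquad \chi_n'(t) > 0 \ \text{ if } \hat\theta_n < t < b. \tag{26} $$

Hence, if $\delta > 0$ and $|t - \hat\theta_n| > \delta$ then $\chi_n(t) > \eta_n(\delta)$, where $\eta_n(\delta) := \chi_n(\hat\theta_n - \delta) \vee \chi_n(\hat\theta_n + \delta)$. Notice that $-n \chi_n(t) \le -(n - 1)\eta_n(\delta) - \chi_n(t)$ if $|t - \hat\theta_n| > \delta$ and $n \ge 1$. Therefore, given some sequence of measures $(\mu_n)_n$ on the Borel subsets of $(a, b)$,

$$ \int_{\{|t - \hat\theta_n| > \delta\}} e^{-n \chi_n(t)}\, \mu_n(dt) \le e^{-(n-1)\eta_n(\delta)} \int_{(a, b)} e^{-\chi_n(t)}\, \mu_n(dt). \tag{27} $$

Taking $\chi_n(t) = d(t, \hat\theta_n)$, $a = 0$, $b = 1$, (26) holds true. Moreover, the integral

$$ \int_{(a, b)} e^{-\chi_n(t)}\, \mu_n(dt) \tag{28} $$

is less than a constant, if $d\mu_n/d\lambda = g_n$ and $\lambda$ is the Lebesgue measure. In fact, $\chi_n$ is nonnegative and $g_n$ dominated. Since $\eta_n(\delta) = d(\hat\theta_n - \delta, \hat\theta_n) \vee d(\hat\theta_n + \delta, \hat\theta_n)$ converges to a positive constant by the strong law of large numbers, (27) yields (23).

If $\chi_n(t) = c_n (t - \hat\theta_n)^2/2$, $a = -\infty$, $b = \infty$, then (26) is satisfied. Moreover, if $d\mu_n/d\lambda(t) = g_n^*(t)$, then the integral (28) turns out to be equal to the integral in (24). Therefore, (25) follows from (27). □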

Lemma 6. Let $(g_n)_{n \ge 1}$ and $(g_n^*)_{n \ge 1}$ be two sequences of nonnegative, continuous and integrable functions defined on (0, 1) and $\mathbb{R}$, respectively, and such that $g_n(t) \sim g_n^*(t)$ as $t \to \hat\theta_n$, for every $n \ge 1$.
Let $d_n(t)$ stand for $d(t, \hat\theta_n)$, and denote:

$$ I_n := \int_{(0,1)} e^{-n d_n(t)}\, g_n(t)\, dt, \tag{29} $$

$$ I_n(x) := \int_{(-\infty, \infty)} e^{-n (d_n''(\hat\theta_n) - x)(t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt. \tag{30} $$

Assume that

$$ \lim_{n \to \infty} \frac{e^{-nc}}{I_n(x) - I_n(y)} = 0, \tag{31} $$

for every $c > 0$, and every $x \ne y$ belonging to some neighborhood of zero. Moreover, let (25) hold with $c_n = d_n''(\hat\theta_n)$, for some positive constants $k, \eta_2$. Then, $I_n \sim I_n(0)$ as $n \to \infty$.

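Proof. This proof will be based on the Laplace method. See, for instance, de Bruijn (1981, pp. 63-65). His results do not precisely fit our needs and therefore we have to prove this lemma starting from scratch.

Recall that $d_n(\hat\theta_n) = 0$. Moreover, by hypothesis, $d_n$ has a unique minimum at $\hat\theta_n$ so that $d_n'(\hat\theta_n) = 0$. By Taylor's theorem, for each $n \ge 1$ and each $\epsilon > 0$ there exists $\delta_n > 0$ such that if $|t - \theta_0| < \delta_n$ then

$$ \left| d_n(t) - \tfrac{1}{2}\, d_n''(\hat\theta_n)\,(t - \hat\theta_n)^2 \right| \le \epsilon\,(t - \hat\theta_n)^2. \tag{32} $$

It will be useful to observe that $\delta_n$ can be taken constant for sufficiently large $n$. In order to show this fact, define

$$ \psi_n(t) := d_n(t) - \tfrac{1}{2}\, d_n''(\hat\theta_n)\,(t - \hat\theta_n)^2, $$

so that $\psi_n(\hat\theta_n) = \psi_n'(\hat\theta_n) = \psi_n''(\hat\theta_n) = 0$. By (11) and (12),

$$ \psi_n'(t) := d_n'(t) - d_n''(\hat\theta_n)\,(t - \hat\theta_n) = (t - \hat\theta_n) \left\{ \frac{f'(\hat\theta_n)}{1 - \hat\theta_n} - \frac{f'(t)}{1 - t} \right\}. \tag{33} $$

Recall that $0 < \theta_0 < 1$ and fix $0 < \gamma < (1 - \theta_0) \wedge \theta_0$. By hypothesis, the function $f'(t)/(1 - t)$ is continuous over the compact set $[\theta_0 - \gamma, \theta_0 + \gamma]$ and therefore is uniformly continuous over that interval. Moreover, recall that by the strong law of large numbers, $\hat\theta_n$ belongs to $[\theta_0 - \gamma, \theta_0 + \gamma]$ if $n \ge N$ for some $N$. Hence, by (33), for every $\epsilon > 0$ there exists $\delta > 0$ such that

$$ \left| \psi_n'(t)/(t - \hat\theta_n) \right| = \left| \frac{f'(t)}{1 - t} - \frac{f'(\hat\theta_n)}{1 - \hat\theta_n} \right| < \epsilon \tag{34} $$

if $|t - \hat\theta_n| < \delta$ and $n \ge N$. By Lagrange's mean value theorem,

$$ \psi_n(t) = \psi_n(t) - \psi_n(\hat\theta_n) = (t - \hat\theta_n)\, \psi_n'(s) \tag{35} $$

for some $s$ between $t$ and $\hat\theta_n$. Combining (34) with (35), one obtains

$$ |\psi_n(t)| = |(t - \hat\theta_n)\, \psi_n'(s)| < \epsilon\, |t - \hat\theta_n|\, |s - \hat\theta_n|. $$

Since $|s - \hat\theta_n| < |t - \hat\theta_n|$, (32) holds true for every $t \in (\theta_0 - \delta, \theta_0 + \delta)$ and every $n \ge N$.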

For every $n \ge 1$, $g_n(t) \sim g_n^*(t)$ as $t \to \hat\theta_n$ by hypothesis. Hence, the function

$$ t \mapsto I_{\{\hat\theta_n\}^c}(t)\, g_n(t)/g_n^*(t) + I_{\{\hat\theta_n\}}(t) $$

is continuous on the compact set $[\theta_0 - \gamma, \theta_0 + \gamma]$ and therefore uniformly continuous on that set. For this reason, for each $\epsilon > 0$ there is $\delta > 0$ such that

$$ g_n^*(t)(1 - \epsilon) \le g_n(t) \le g_n^*(t)(1 + \epsilon) \tag{36} $$

if $|t - \hat\theta_n| < \delta$ and $n$ is sufficiently large.

At this stage, fix $\epsilon$ belonging to $\big(0, -f'(\theta_0)/\{4(1 - \theta_0)\}\big)$ so that

$$ \epsilon < -f'(\hat\theta_n)/\{3(1 - \hat\theta_n)\} = d_n''(\hat\theta_n)/3 \tag{37} $$

for $n \ge M$ and some $M \ge N$. Moreover, take $\delta > 0$ small enough so that (32) and (36) are both satisfied.

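Decompose the integral $I_n$ defined by (29) in the following way:

$$ I_n = \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)} e^{-n d_n(t)}\, g_n(t)\, dt + \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)^c} e^{-n d_n(t)}\, g_n(t)\, dt. $$

The first term can be bounded by (32), the second one by (23), obtaining

$$ I_n \ge \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)} e^{-n (d_n''(\hat\theta_n) + 2\epsilon)(t - \hat\theta_n)^2/2}\, g_n(t)\, dt, $$

$$ I_n \le \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)} e^{-n (d_n''(\hat\theta_n) - 2\epsilon)(t - \hat\theta_n)^2/2}\, g_n(t)\, dt + c_0\, e^{-n \eta_1}, $$

which in virtue of (36) becomes

$$ I_n \ge (1 - \epsilon) \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)} e^{-n (d_n''(\hat\theta_n) + 2\epsilon)(t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt, \tag{38} $$

$$ I_n \le (1 + \epsilon) \int_{(\hat\theta_n - \delta,\, \hat\theta_n + \delta)} e^{-n (d_n''(\hat\theta_n) - 2\epsilon)(t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt + c_0\, e^{-n \eta_1}. \tag{39} $$

By hypothesis, (25) holds true with $c_n = d_n''(\hat\theta_n)$, for some positive constants $k, \eta_2$. Therefore, (38) becomes

$$ I_n \ge (1 - \epsilon) \int_{(-\infty, \infty)} e^{-n (d_n''(\hat\theta_n) + 2\epsilon)(t - \hat\theta_n)^2/2}\, g_n^*(t)\, dt - (1 - \epsilon)\, k\, e^{-n \eta_2}. \tag{40} $$

Recalling (30), the combination of (40) and (39) yields

$$ (1 - \epsilon)\, I_n(-2\epsilon) - (1 - \epsilon)\, k\, e^{-n \eta_2} \le I_n \le (1 + \epsilon)\, I_n(2\epsilon) + c_0\, e^{-n \eta_1} \tag{41} $$

for $n$ sufficiently large. By (31), if $n$ is sufficiently large, then

$$ e^{-n \eta_1} \le (1 + \epsilon)\big(I_n(3\epsilon) - I_n(2\epsilon)\big)/c_0, \qquad e^{-n \eta_2} \le \big(I_n(-2\epsilon) - I_n(-3\epsilon)\big)/k, $$

$I_n(x)$ being an increasing function of $x$. Therefore, by (41),

$$ (1 - \epsilon)\, I_n(-3\epsilon) \le I_n \le (1 + \epsilon)\, I_n(3\epsilon) \tag{42} $$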

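holds true for sufficiently large $n$. The number $\epsilon$ being arbitrary, it follows that $I_n \sim I_n(0)$ as $n \to \infty$. □

Lemma 7. Let $d_n(t)$ stand for $d(t, \hat\theta_n)$. If the hypotheses and the conditions of Proposition 1 hold true, $p$ is an integrable, nonnegative and continuous function on (0, 1) and $0 < a < \theta_0 < b < 1$, then

$$ \int_{(0,1)} p(t)\, e^{-n d_n(t)}\, dt \sim \sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-1/2}, \tag{43} $$

$$ \int_{(0,1)} p(t)\, e^{-n d_n(t)}\, (t - \hat\theta_n)^2\, dt \sim \sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-3/2}, \tag{44} $$

$$ \int_{(0,1)} p(t)\, e^{-n d_n(t)}\, t^2\, dt \sim \sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-1/2}\, \big(\hat\theta_n^2 + \{n\, d_n''(\hat\theta_n)\}^{-1}\big), \tag{45} $$

$$ \int_{(b, 1)} p(s)\, e^{-n d_n(s)}\, ds \sim \frac{p(b)}{n\, d_n'(b)}\, e^{-n d_n(b)}, \tag{46} $$

$$ \int_{(0, a)} p(s)\, e^{-n d_n(s)}\, ds \sim -\frac{p(a)}{n\, d_n'(a)}\, e^{-n d_n(a)}, \tag{47} $$

as $n \to \infty$.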
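The first of these Laplace-type approximations can be compared with direct numerical integration. The sketch below is an illustration under assumptions of our own: $f = -\ln$, $p \equiv 1$ (a uniform density), a fixed value of $\hat\theta_n$, and a midpoint rule with an arbitrary number of cells; the names `integral_midpoint` and `laplace_value` are hypothetical. With $f = -\ln$, the second derivative of $d(\cdot, \hat\theta_n)$ at $\hat\theta_n$ equals $1/\{\hat\theta_n(1 - \hat\theta_n)\}$.

```python
import math

def d(t, y):
    """d(t, y) with f = -ln: the Kullback-Leibler divergence between Bernoulli(y) and Bernoulli(t)."""
    return y * (math.log(y) - math.log(t)) + (1.0 - y) * (math.log(1.0 - y) - math.log(1.0 - t))

def integral_midpoint(n, theta_hat, cells=200_000):
    """Midpoint-rule approximation of the integral of exp(-n d(t, theta_hat)) over (0, 1)."""
    width = 1.0 / cells
    return width * sum(math.exp(-n * d((j + 0.5) * width, theta_hat)) for j in range(cells))

def laplace_value(n, theta_hat):
    """sqrt(2 pi) {n d''}^(-1/2) with d'' = 1 / (theta_hat (1 - theta_hat))."""
    second_derivative = 1.0 / (theta_hat * (1.0 - theta_hat))
    return math.sqrt(2.0 * math.pi / (n * second_derivative))

n, theta_hat = 2000, 0.3
ratio = integral_midpoint(n, theta_hat) / laplace_value(n, theta_hat)
print(ratio)
```

The printed ratio is close to one, and it moves closer to one as `n` grows, which is what the asymptotic equivalence asserts in this special case.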

Proof. Lemma 6 will be applied to prove (43), (44) and (45). Three cases will be considered:

Case A) $g_n(t) = p(t)$, $g_n^*(t) = p(\hat\theta_n)$;

Case B) $g_n(t) = p(t)(t - \hat\theta_n)^2$, $g_n^*(t) = p(\hat\theta_n)(t - \hat\theta_n)^2$;

Case C) $g_n(t) = p(t)\, t^2$, $g_n^*(t) = p(\hat\theta_n)\, t^2$.

Notice that the integral $I_n(x)$ defined by (30) is finite if $x < -f'(\theta_0)/\{2(1 - \theta_0)\}$ and $n$ is sufficiently large. In fact, by the strong law of large numbers, this entails that $x < -f'(\hat\theta_n)/(1 - \hat\theta_n)$ for sufficiently large $n$, and therefore, by (12) in Lemma 4, $d_n''(\hat\theta_n) - x$ is positive.
If $x < -f'(\theta_0)/\{2(1 - \theta_0)\}$, then

$$ I_n(x) = \sqrt{\frac{2\pi}{n (d_n''(\hat\theta_n) - x)}}\; E\big(g_n^*(W_n)\big), $$

where $W_n$ is a Gaussian random variable with mean $\hat\theta_n$ and variance $\{n (d_n''(\hat\theta_n) - x)\}^{-1}$. Therefore,

Case A) $I_n(x) = \sqrt{2\pi}\, p(\hat\theta_n)\, \{n (d_n''(\hat\theta_n) - x)\}^{-1/2}$,

Case B) $I_n(x) = \sqrt{2\pi}\, p(\hat\theta_n)\, \{n (d_n''(\hat\theta_n) - x)\}^{-3/2}$,

Case C) $I_n(x) = \sqrt{2\pi}\, p(\hat\theta_n)\, \{n (d_n''(\hat\theta_n) - x)\}^{-1/2}\, \big(\hat\theta_n^2 + \{n (d_n''(\hat\theta_n) - x)\}^{-1}\big)$,

if $x < -f'(\theta_0)/\{2(1 - \theta_0)\}$ and $n$ is sufficiently large.

In all three cases, (31) holds true if $c > 0$, $|x|, |y| < -f'(\theta_0)/\{2(1 - \theta_0)\}$. Moreover, the integral in (24) with $c_n = n\, d_n''(\hat\theta_n)$ is equal to

Case A) $\sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-1/2}$,

Case B) $\sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-3/2}$,

Case C) $\sqrt{2\pi}\, p(\hat\theta_n)\, \{n\, d_n''(\hat\theta_n)\}^{-1/2}\, \big(\hat\theta_n^2 + \{n\, d_n''(\hat\theta_n)\}^{-1}\big)$,

which converge to zero by continuity of $d_n''$ and $p$, and by the strong law of large numbers. Therefore, (24) is satisfied in all three cases if $n$ is sufficiently large. This allows us to apply Lemma 5 and to obtain that (25) holds true for some positive constants $k, \eta_2$.

Since (31) and (25) hold for some positive constants $k, \eta_2$, Lemma 6 can be applied and (43), (44) and (45) are proved.

At this stage, our aim is to prove (46) and (47). The Laplace method will be used again. By continuity of the function $p$, for each $\epsilon > 0$ there is $\delta > 0$ such that

$$ p(b)(1 - \epsilon) < p(t) < p(b)(1 + \epsilon) \tag{48} $$

if $b < t < b + \delta$.

Since $f$ is continuous, the functions

$$ t \mapsto f(t) - f(b) - (t - b) f'(b), \qquad t \mapsto f(1 - t) - f(1 - b) - (b - t) f'(1 - b) $$

are continuous at $b$ and at $1 - b$. Therefore, for a given $\epsilon > 0$ we can fix $\delta > 0$ such that

$$ \left| d(t, s) - d(b, s) - (t - b) \left. \frac{\partial d(x, s)}{\partial x} \right|_{x = b} \right| \le s\, \big| f(t) - f(b) - (t - b) f'(b) \big| + (1 - s)\, \big| f(1 - t) - f(1 - b) - (b - t) f'(1 - b) \big| < \epsilon $$

holds for every $t \in (b, b + \delta)$ and every $s \in (0, 1)$.
Hence, for a given $\epsilon > 0$, we can fix $\delta > 0$ such that

$$ |d_n(t) - d_n(b) - (t - b)\, d_n'(b)| < \epsilon \tag{49} $$

holds true together with (48) for every $t \in (b, b + \delta)$.
Denote

$$ J_n := \int_{(b, 1)} e^{-n d_n(t)}\, p(t)\, dt. $$



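Since $d_n$ is increasing on $(b + \delta, 1) \subset (\theta_0, 1)$,

$$ J_n = \int_{(b,\, b+\delta)} e^{-n d_n(t)}\, p(t)\, dt + \int_{(b+\delta,\, 1)} e^{-n d_n(t)}\, p(t)\, dt \le \int_{(b,\, b+\delta)} e^{-n d_n(t)}\, p(t)\, dt + k\, e^{-n d_n(b+\delta)}, \tag{50} $$

where $k = \int_{(b+\delta,\, 1)} p(t)\, dt$. Therefore, we can write

$$ \int_{(b,\, b+\delta)} e^{-n d_n(t)}\, p(t)\, dt \le J_n \le \int_{(b,\, b+\delta)} e^{-n d_n(t)}\, p(t)\, dt + k\, e^{-n d_n(b+\delta)}, $$

which yields, by (48) and (49), for sufficiently large $n$,

$$ (p(b) - \epsilon) \int_{(b,\, b+\delta)} e^{-n\{d_n(b) + (d_n'(b) + \epsilon)(t - b)\}}\, dt \le J_n \le (p(b) + \epsilon) \int_{(b,\, b+\delta)} e^{-n\{d_n(b) + (d_n'(b) - \epsilon)(t - b)\}}\, dt + k\, e^{-n d_n(b+\delta)}, $$

that is

$$ (p(b) - \epsilon)\, \tilde J_n(-\epsilon) \le J_n \le (p(b) + \epsilon)\, \tilde J_n(\epsilon) + k\, e^{-n d_n(b+\delta)}, \tag{51} $$

where

$$ \tilde J_n(x) = e^{-n d_n(b)}\, \frac{1 - e^{-n (d_n'(b) - x)\delta}}{n (d_n'(b) - x)}. $$

At this stage, denote

$$ J_n^*(x) := \frac{e^{-n d_n(b)}}{n (d_n'(b) - x)}. $$

Fix $x < (\theta_0 - b) f'(b)/\{2(1 - b)\}$, so that $x < d_n'(b)$ for sufficiently large $n$. If $n$ is sufficiently large, then $(1 - \epsilon)\, J_n^*(x) \le \tilde J_n(x) \le J_n^*(x)$. Hence, (51) becomes

$$ (p(b) - \epsilon)(1 - \epsilon)\, J_n^*(-\epsilon) \le J_n \le (p(b) + \epsilon)\, J_n^*(\epsilon) + k\, e^{-n d_n(b+\delta)}. \tag{52} $$

Since $d_n$ is increasing on $(b + \delta, 1) \subset (\theta_0, 1)$, $e^{-n d_n(b+\delta)} = o(J_n^*(x))$ as $n \to \infty$ for $x < (\theta_0 - b) f'(b)/\{2(1 - b)\}$. In fact, $d_n(b)$, $d_n'(b)$ and $d_n(b + \delta)$ converge to positive constants by the strong law of large numbers, $f$ (and therefore $d_n$ and $d_n'$) being continuous. Hence, (52) yields

$$ (p(b) - \epsilon)(1 - \epsilon)\, J_n^*(-\epsilon) \le J_n \le (p(b) + 2\epsilon)\, J_n^*(\epsilon) $$

for sufficiently large $n$. The number $\epsilon$ being arbitrary, it follows that

$$ J_n \sim p(b)\, J_n^*(0) $$

as $n \to \infty$ and (46) is proved.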

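In order to prove (47), take $\tilde d_n(t) := d(t, 1 - \hat\theta_n) = d_n(1 - t)$, $\tilde p(t) := p(1 - t)$, $\tilde\theta_0 := 1 - \theta_0$, $\tilde b := 1 - a$ (so that $\tilde b > \tilde\theta_0$) and notice that by (46)

$$ \int_{(\tilde b, 1)} \tilde p(s)\, e^{-n \tilde d_n(s)}\, ds \sim \frac{\tilde p(\tilde b)}{n\, \tilde d_n'(\tilde b)}\, e^{-n \tilde d_n(\tilde b)}, $$

and then apply the substitution $t = 1 - s$ in the integral. □

Proof of Proposition 2. Let $p = d\pi/d\lambda$, where $\lambda$ is the Lebesgue measure. In order to apply Lemma 7 notice that

$$ \pi_f^{(n)}\big((\theta_0 - \epsilon, \theta_0 + \epsilon)^c\big) = \frac{\int_{(0,\, \theta_0 - \epsilon)} e^{-n d(\theta, \hat\theta_n)}\, p(\theta)\, d\theta + \int_{(\theta_0 + \epsilon,\, 1)} e^{-n d(\theta, \hat\theta_n)}\, p(\theta)\, d\theta}{\int_{(0,1)} e^{-n d(\theta, \hat\theta_n)}\, p(\theta)\, d\theta}. $$

Hence, combining (43) with (47) and (46), the thesis follows. □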

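Proof of Proposition 3. Denote by $p(\cdot)$ the density of $\pi$ with respect to the Lebesgue measure. To begin, notice that

$$ \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) = \frac{\int_{(0,1)} (t - \hat\theta_n)^2\, e^{-n d(t, \hat\theta_n)}\, d\pi(t)}{\int_{(0,1)} e^{-n d(t, \hat\theta_n)}\, d\pi(t)} = \frac{\int_{(0,1)} p(t)\, (t - \hat\theta_n)^2\, e^{-n d(t, \hat\theta_n)}\, dt}{\int_{(0,1)} p(t)\, e^{-n d(t, \hat\theta_n)}\, dt}. \tag{53} $$

Lemma 7 can be applied for both the numerator and the denominator of (53). Combination of (53) with (44) and (43) yields

$$ \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) \sim \frac{1}{n\, d_n''(\hat\theta_n)}, \tag{54} $$

as $n \to \infty$.

Combining (54) with (12), one obtains that

$$ \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) \sim -\frac{1 - \hat\theta_n}{n\, f'(\hat\theta_n)}, \tag{55} $$

as $n \to \infty$.

In virtue of continuity of $f'$, by the strong law of large numbers, (55) entails that

$$ \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) \sim -\frac{1 - \theta_0}{n\, f'(\theta_0)} \tag{56} $$

as $n \to \infty$.

Let $E_f(\hat\theta_n)$ denote the mean with respect to the distribution $\pi_f^{(n)}$. Notice that

$$ V_f(\hat\theta_n) = \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) - \big(E_f(\hat\theta_n) - \hat\theta_n\big)^2, \tag{57} $$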

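and

$$ \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) = \int_{(0,1)} t^2\, \pi_f^{(n)}(dt) - 2\hat\theta_n \int_{(0,1)} t\, \pi_f^{(n)}(dt) + \hat\theta_n^2. \tag{58} $$

By (58), we obtain that

$$ \big| E_f(\hat\theta_n) - \hat\theta_n \big| = \frac{1}{2\hat\theta_n} \left| \int_{(0,1)} t^2\, \pi_f^{(n)}(dt) - \hat\theta_n^2 - \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) \right| \le \frac{1}{2\hat\theta_n} \left\{ \left| \int_{(0,1)} t^2\, \pi_f^{(n)}(dt) - \hat\theta_n^2 \right| + \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt) \right\}. \tag{59} $$

At this stage, dividing (45) by (43) and applying (12) and the strong law of large numbers, one obtains that

$$ \int_{(0,1)} t^2\, \pi_f^{(n)}(dt) - \hat\theta_n^2 \sim -\frac{1 - \theta_0}{n\, f'(\theta_0)}. \tag{60} $$

In virtue of (56) and (60), equation (59) yields:

$$ E_f(\hat\theta_n) - \hat\theta_n = O(n^{-1}). $$

Hence, $(E_f(\hat\theta_n) - \hat\theta_n)^2$ is negligible with respect to (56) and therefore (57) entails that

$$ V_f(\hat\theta_n) \sim \int_{(0,1)} (t - \hat\theta_n)^2\, \pi_f^{(n)}(dt). \tag{61} $$

The thesis follows from (61) and (56). □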